75 research outputs found
String Matching with Multicore CPUs: Performing Better with the Aho-Corasick Algorithm
Multiple string matching is known as locating all the occurrences of a given
number of patterns in an arbitrary string. It is used in bio-computing
applications where the algorithms are commonly used for retrieval of
information such as sequence analysis and gene/protein identification.
Extremely large amount of data in the form of strings has to be processed in
such bio-computing applications. Therefore, improving the performance of
multiple string matching algorithms is always desirable. Multicore
architectures are capable of providing better performance by parallelizing the
multiple string matching algorithms. The Aho-Corasick algorithm is the one that
is commonly used in exact multiple string matching algorithms. The focus of
this paper is the acceleration of Aho-Corasick algorithm through a multicore
CPU based software implementation. Through our implementation and evaluation of
results, we prove that our method performs better compared to the state of the
art
AntiPlag: Plagiarism Detection on Electronic Submissions of Text Based Assignments
Plagiarism is one of the growing issues in academia and is always a concern
in Universities and other academic institutions. The situation is becoming even
worse with the availability of ample resources on the web. This paper focuses
on creating an effective and fast tool for plagiarism detection for text based
electronic assignments. Our plagiarism detection tool named AntiPlag is
developed using the tri-gram sequence matching technique. Three sets of text
based assignments were tested by AntiPlag and the results were compared against
an existing commercial plagiarism detection tool. AntiPlag showed better
results in terms of false positives compared to the commercial tool due to the
pre-processing steps performed in AntiPlag. In addition, to improve the
detection latency, AntiPlag applies a data clustering technique making it four
times faster than the commercial tool considered. AntiPlag could be used to
isolate plagiarized text based assignments from non-plagiarised assignments
easily. Therefore, we present AntiPlag, a fast and effective tool for
plagiarism detection on text based electronic assignments
- …